QCS: A system for querying, clustering and summarizing documents

Authors

  • Daniel M. Dunlavy
  • Dianne P. O'Leary
  • John M. Conroy
  • Judith D. Schlesinger
Abstract

Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system—the Query, Cluster, Summarize (QCS) system—which is portable, modular, and permits experimentation with different instantiations of each of the constituent text


Similar resources

QCS: A Tool for Querying, Clustering, and Summarizing Documents

The QCS information retrieval (IR) system is presented as a tool for querying, clustering, and summarizing document sets. QCS has been developed as a modular development framework, and thus facilitates the inclusion of new technologies targeting these three IR tasks. Details of the system architecture, the QCS interface, and preliminary results are presented.


Document clustering based on ontology and a fuzzy approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, one of the important techniques of text mining, is the unsupervised classification of similar documents into different groups. The most important step...


Summarizing Massive Information for Querying Web Sources and Data Streams



A trainable algorithm for summarizing news stories

This work proposes a trainable system for summarizing news and obtaining an approximate argumentative structure of the source text. To achieve these goals we use several techniques and heuristics, such as detecting the main concepts in the text, connectivity between sentences, occurrence of proper nouns, anaphors, discourse markers and a binary-tree representation (due to the use of an agglomer...


Metaheuristic clustering of Persian XML documents based on structural and content similarity

Due to the increasing number of XML documents, organizing them effectively in order to retrieve useful information is essential. One possible solution is to cluster the XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...



Journal:
  • Inf. Process. Manage.

Volume 43

Publication date: 2007